Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.
نویسندگان
چکیده
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
منابع مشابه
Genome-wide computational prediction of miRNAs in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) revealed target genes involved in pulmonary vasculature and antiviral innate immunity
The current outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)in China threatened humankind worldwide. The coronaviruses contains the largest RNA genome among all other known RNA viruses, therefore the disease etiology can be understood by analyzing the genome sequence of SARS-CoV-2. In this study, we used an ab-intio based computational tool VMir to scan the complete geno...
متن کاملThe in Silico Characterization of a Salicylic Acid Analogue Coding Gene Clusters in Selected Pseudomonas Fluorescens Strains
Background: The microbial genome sequences provide solid in silico framework for interpretation their drug-like chemical scaffolds biosynthetic potential. The Pseudomonas fluorescens species is metabolically versatile and producing therapeutically important natural products.Objectives: The main objective of the present study was to mine the publically available data of P. fluorescens stra...
متن کاملGenome-wide Association Study to Identify Genes and Biological Pathways Associated with Type Traits in Cattle using Pathway Analysis
Extended Abstract Introduction and Objective: Type traits describing the skeletal characteristics of an animal are moderately to strongly genetically correlate with other economically important traits in cattle including fertility, longevity and carcass traits. The present study aimed to conduct a genome wide association studies (GWAS) based on gene-set enrichment analysis for identifying the ...
متن کاملPrediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks
Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (meta...
متن کاملBioinformatics Genome-Wide Characterization of the WRKY Gene Family in Sorghum bicolor
The WRKY gene family encodes a large group of transcription factors that regulate genes involved in plant response to biotic and abiotic stresses. Sorghum is a notable grain and forage crop in semi-arid regions because of its unusual tolerance against hot and dry environments. We identified a set of 85 WRKY genes in the S. bicolor genome and classified them into three groups (I–III). Among the ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Plant physiology
دوره 173 4 شماره
صفحات -
تاریخ انتشار 2017